Cyber Crime
Cyber criminals exploit the anonymity of the cyber world to carry out malicious activities such as identity theft, harassment and phishing. In a phishing scam, the scammers create websites and send out emails that trick account holders into revealing sensitive information such as passwords and account numbers. Investigators usually solve these crimes by tracing back the IP addresses found in the headers of the anonymous emails. At times, however, the information gathered from an IP address is not enough to identify the culprit, for example when the email is sent through a proxy server or when the computer used to send it has more than one user (Fouss et al., 2010).
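The header-based tracing described above can be illustrated with a minimal sketch using Python's standard library. The raw message, host names and IP addresses below are hypothetical (the addresses come from reserved documentation ranges), and the sketch assumes each relay records the connecting address in square brackets inside its Received header. Because these headers can be forged or may only point to a proxy, the result is a lead rather than proof of origin.

```python
import email
import ipaddress
import re

# Hypothetical raw message, for illustration only.
RAW_MESSAGE = (
    "Received: from mail.example.org (mail.example.org [203.0.113.7])\n"
    "\tby mx.example.net with ESMTP; Mon, 1 Mar 2010 10:00:02 +0000\n"
    "Received: from client.example.com (unknown [198.51.100.23])\n"
    "\tby mail.example.org with ESMTP; Mon, 1 Mar 2010 10:00:00 +0000\n"
    "From: sender@example.org\n"
    "To: holder@example.net\n"
    "Subject: Account verification\n"
    "\n"
    "Please confirm your password.\n"
)

# Assumption: relays write the connecting IPv4 address as [a.b.c.d].
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_ips(raw_message):
    """Return the IPv4 addresses found in Received headers, newest hop first."""
    msg = email.message_from_string(raw_message)
    hops = []
    for header in msg.get_all("Received", []):
        for candidate in BRACKETED_IPV4.findall(header):
            try:
                hops.append(ipaddress.ip_address(candidate))
            except ValueError:
                # Skip strings that look like an address but are not valid.
                continue
    return hops


hops = received_ips(RAW_MESSAGE)
# Each relay prepends its own header, so the last entry is the hop
# closest to the sender, if the headers are trustworthy.
origin_candidate = hops[-1] if hops else None
print(origin_candidate)
```

If the last hop belongs to a proxy or to a machine shared by several users, the address alone does not single out one person, which is the limitation noted above.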
Authorship analysis techniques are used to address this problem of anonymity in online communication. The study of authorship has a long history in resolving authorial disputes over poetic and historic works, but its application to online textual communication is still very limited. The reason is that traditional written works contain a large amount of text that is well structured and follows grammatical rules and common syntactic conventions. Online documents such as instant messages and emails, by contrast, are short and poorly structured, are written mostly in paralanguage, and contain many grammatical and spelling mistakes. Because of these differences, some of the features used in authorship analysis cannot be applied to online textual data (Abbassi and Chen, 2009).
This paper addresses the following three authorship analysis problems.
First, authorship identification with large training samples: suppose a cybercrime investigator wants to pinpoint the most likely author of a particular anonymous text message from among a group of suspects. We assume that the investigator has access to a large collection of messages previously written by the suspects. In actual investigations, sample text messages can be obtained, with a warrant, from the chat logs or email archives on a suspect's personal computer, and they provide a sample of each suspect's writing style. Much of the previous work on authorship identification assumes that every suspect has only one writing style. We argue that a person's writing style changes depending on the topic. The challenge is to identify these stylistic variations and use them to improve the accuracy of authorship identification (Fouss et al., 2010).
Second, authorship identification with small training samples: a cybercrime investigator is given a number of anonymous messages and a group of suspects, and wants to correctly identify the author of each anonymous message. The assumption in this problem is that the investigator has access to only a small number of training samples. The challenge is to identify the authors from this limited training data by finding specific patterns (Fouss et al., 2010).
Third, authorship characterization: here the cybercrime investigator has a collection of anonymous text messages but no idea who the probable suspects are, so no training samples from suspects are available. The investigator would still like to infer characteristics of the authors, such as age group, ethnicity and gender, from their writing styles. The assumption is that the investigator has access to external sources of text messages, such as social network websites or blog postings. The challenge is how to use these external sources to deduce the characteristics of the authors (Abbassi and Chen, 2009).
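To make the identification problems more concrete, the following is a minimal sketch of one simple baseline: each suspect's known messages are summarized as a profile of character trigram counts, and an anonymous message is attributed to the suspect whose profile is most similar under cosine similarity. This is an illustrative assumption and not the method proposed in this paper; the suspect names and messages are invented, and a real investigation would need far more data and careful validation.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams after lowercasing and collapsing whitespace."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(samples):
    """Merge each suspect's training messages into one n-gram profile."""
    profiles = {}
    for author, messages in samples.items():
        profile = Counter()
        for message in messages:
            profile.update(char_ngrams(message))
        profiles[author] = profile
    return profiles


def most_likely_author(anonymous_message, profiles):
    """Return the best-matching suspect and the similarity score of each."""
    target = char_ngrams(anonymous_message)
    scores = {author: cosine(target, p) for author, p in profiles.items()}
    return max(scores, key=scores.get), scores


# Hypothetical training samples, for illustration only.
SAMPLES = {
    "suspect_a": ["hey u there?? call me l8r plz", "u coming 2nite?? plz reply"],
    "suspect_b": [
        "Good afternoon, I shall telephone you later this evening.",
        "Kindly confirm whether you will attend tonight.",
    ],
}

profiles = build_profiles(SAMPLES)
author, scores = most_likely_author("plz call me 2nite u there??", profiles)
print(author, scores)
```

Character n-grams are used here because they tolerate the misspellings and informal abbreviations typical of online messages. A single profile per suspect, however, embodies exactly the one-style-per-author assumption questioned above, so the sketch should be read as a starting point only.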
Literature review
Authorship analysis is the study of the linguistic and computational characteristics of documents written by individuals. The writing traits, or writing style, extracted from a person's written documents can be used to distinguish one person from another. There are 5 main categories...